Abstract: Recently, researchers showed that dirty paper coding
(DPC) is the optimal transmission strategy for multiple-input
multiple-output broadcast channels (MIMO-BC). In this paper, 
we study how to determine the maximum weighted sum of DPC 
rates through solving the maximum weighted sum rate problem 
of the dual MIMO multiple access channel (MIMO-MAC) with a 
sum power constraint. We first simplify the maximum weighted 
sum rate problem such that enumerating all possible decoding 
orders in the dual MIMO-MAC is unnecessary. We then design an 
efficient algorithm based on conjugate gradient projection (CGP) 
to solve the maximum weighted sum rate problem. Our proposed 
CGP method utilizes the powerful concept of Hessian conjugacy. 
We also develop a rigorous algorithm to solve the projection 
problem. We show that CGP enjoys provable convergence, nice 
scalability, and great efficiency for large MIMO-BC systems. 

I. Introduction 

The capacity region of multiple-input multiple-output broad- 
cast channels (MIMO-BC) has received great attention in 
recent years. MIMO-BC belongs to the class of nondegraded
broadcast channels, for which the capacity region is notoriously
hard to analyze [1]. Very recently, researchers have
made significant progress in this area. Most notably, Weingarten
et al. finally proved in [2] the long-open conjecture that
the "dirty paper coding" (DPC) strategy is the capacity-achieving
transmission strategy for MIMO-BC. Moreover, by
the remarkable channel duality between MIMO-BC and its 
dual MIMO multiple access channel (MIMO-MAC) [3]-[5], 
the nonconvex MIMO-BC capacity region (with respect to 
the input covariance matrices) can be transformed to the 
convex dual MIMO-MAC capacity region with a sum power 
constraint. 

In this paper, we study how to determine the maximum 
weighted sum of DPC rates (MWSR) of MIMO-BC through 
solving the maximum weighted sum rate problem of the dual 
MIMO-MAC. Important applications of the MWSR problem 
of MIMO-BC include, but are not limited to, applying Lagrangian
dual decomposition to the cross-layer optimization
of MIMO-based mesh networks [6]. The MWSR problem
of MIMO-BC is the general case of the maximum sum rate 
problem (MSR) of MIMO-BC, which has been solved by 
using various algorithms in the literature. Such algorithms 
include the minimax method (MM) by Lan and Yu [7], the 
steepest descent (SD) method by Viswanathan et al. [8], the 
dual decomposition (DD) method by Yu [9], two iterative 
water-filling methods (IWFs) by Jindal et al. [10], and the 



conjugate gradient projection method recently proposed by 
us [11]. Among these algorithms, IWFs and CGP appear to 
be the simplest. However, all of these existing algorithms 
have limitations in that they cannot be readily extended to 
the MWSR problem of MIMO-BC. As we show later, the
MWSR problem has a very different and much more complex
objective function. The aforementioned
algorithms can only handle the objective function of
MSR, which is just a special case of MWSR (by setting all 
weights to one). These limitations of the existing algorithms 
motivate us to design an efficient and scalable algorithm with 
a modest storage requirement to solve the MWSR problem of 
large MIMO-BC systems. 

In this paper, we significantly extend our CGP method 
in [11] to handle the MWSR problem of MIMO-BC. Our 
CGP method is inspired by [12], where a gradient projection 
method was used to heuristically solve the MSR problem of 
MIMO interference channels. However, unlike [12], we use the 
conjugate gradient directions instead of gradient directions to 
eliminate the "zigzagging" phenomenon. Also different from 
[12], we develop a rigorous algorithm to exactly solve the 
projection problem. Our main contributions in this paper are 
three-fold: 

1) To the best of our knowledge, our paper is the first 
work that considers the MWSR problem of MIMO-BC. 
Studying the MWSR problem is more useful and more 
important because the MWSR problem is the general 
case of MSR, and it has much wider application in 
systems and networks that employ MIMO-BC. 

2) We simplify the MWSR problem of the dual MIMO- 
MAC such that enumerating all different decoding orders 
in the dual MIMO-MAC is unnecessary, thus paving 
the way to design an algorithm to efficiently solve the 
MWSR problem of MIMO-BC. 

3) We extend the CGP method in [11] for the MWSR 
problem of MIMO-BC. This extended CGP method still 
enjoys provable convergence as well as nice scalability, 
and has the desirable linear complexity. Also, the ex- 
tended CGP method is insensitive to the increase of the 
number of users and has a modest memory requirement. 

The remainder of this paper is organized as follows. In 
Section II, we discuss the network model and the problem 
formulation. Section III introduces the key components of



CGP, including the computation of conjugate gradients and 
performing projection. We analyze the complexity of CGP in 
Section IV. Numerical results of CGP's convergence behavior 
and performance comparison with other existing algorithms 
are presented in Section V. Section VI concludes this paper. 

II. System Model and Problem Formulation 

We begin by introducing notation. We use boldface to
denote matrices and vectors. For a complex-valued matrix $\mathbf{A}$,
$\mathbf{A}^{*}$ and $\mathbf{A}^{\dagger}$ denote the conjugate and the conjugate transpose
of $\mathbf{A}$, respectively. $\mathrm{Tr}\{\mathbf{A}\}$ denotes the trace of $\mathbf{A}$. We let $\mathbf{I}$
denote the identity matrix with dimension determined from
context. $\mathbf{A} \succeq 0$ represents that $\mathbf{A}$ is Hermitian and positive
semidefinite (PSD). $\mathrm{Diag}\{\mathbf{A}_1 \ldots \mathbf{A}_n\}$ denotes the block diagonal
matrix with matrices $\mathbf{A}_1, \ldots, \mathbf{A}_n$ on its main diagonal.

Suppose that a MIMO Gaussian broadcast channel has $K$
users, each of which is equipped with $n_r$ antennas, and the
transmitter has $n_t$ antennas. The channel matrix for user $i$
is denoted as $\mathbf{H}_i \in \mathbb{C}^{n_r \times n_t}$. In [2], it has been shown that
the capacity region of MIMO-BC is equal to the dirty-paper
coding (DPC) rate region. In the DPC rate region, suppose that users
$1, \ldots, K$ are encoded sequentially; then the rate of user $i$ can
be computed as [3]

$$R_i^{\mathrm{DPC}}(\boldsymbol{\Gamma}) = \log \frac{\det\left(\mathbf{I} + \mathbf{H}_i \left(\sum_{j=i}^{K} \boldsymbol{\Gamma}_j\right) \mathbf{H}_i^{\dagger}\right)}{\det\left(\mathbf{I} + \mathbf{H}_i \left(\sum_{j=i+1}^{K} \boldsymbol{\Gamma}_j\right) \mathbf{H}_i^{\dagger}\right)}, \tag{1}$$

where $\boldsymbol{\Gamma}_i \in \mathbb{C}^{n_t \times n_t}$, $i = 1, \ldots, K$, are the downlink input
covariance matrices, and $\boldsymbol{\Gamma} = \{\boldsymbol{\Gamma}_1, \ldots, \boldsymbol{\Gamma}_K\}$ denotes the collection
of all the downlink covariance matrices. As a result, the
MWSR problem can then be written as follows:

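$$\begin{array}{ll}
\text{Maximize} & \sum_{i=1}^{K} u_i R_i^{\mathrm{DPC}}(\boldsymbol{\Gamma}) \\
\text{subject to} & \boldsymbol{\Gamma}_i \succeq 0, \quad i = 1, \ldots, K \\
& \sum_{i=1}^{K} \mathrm{Tr}(\boldsymbol{\Gamma}_i) \le P,
\end{array} \tag{2}$$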

where $u_i$ is the weight of user $i$, and $P$ represents the maximum
transmit power at the transmitter. It is evident that (2) is a
nonconvex optimization problem since the DPC rate equation
in (1) is neither a concave nor a convex function in the
input covariance matrices $\boldsymbol{\Gamma}_1, \ldots, \boldsymbol{\Gamma}_K$. However, the authors
in [3] showed that due to the duality between MIMO-BC
and MIMO-MAC, the rates achievable in MIMO-BC are also
achievable in MIMO-MAC. That is, given a feasible $\boldsymbol{\Gamma}$, there
exists a set of feasible uplink input covariance matrices for
the dual MIMO-MAC, denoted by $\mathbf{Q}$, such that $R_i^{\mathrm{MAC}}(\mathbf{Q}) = R_i^{\mathrm{DPC}}(\boldsymbol{\Gamma})$.
Thus, (2) is equivalent to the following maximum
weighted sum rate problem of the dual MIMO-MAC with a
sum power constraint:

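$$\begin{array}{ll}
\text{Maximize} & \sum_{i=1}^{K} u_i R_i^{\mathrm{MAC}}(\mathbf{Q}) \\
\text{subject to} & \mathbf{Q}_i \succeq 0, \quad i = 1, \ldots, K \\
& R_i^{\mathrm{MAC}}(\mathbf{Q}) \in \mathcal{C}_{\mathrm{MAC}}(P, \mathbf{H}^{\dagger}), \quad i = 1, \ldots, K \\
& \sum_{i=1}^{K} \mathrm{Tr}(\mathbf{Q}_i) \le P,
\end{array} \tag{3}$$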



where $\mathbf{Q}_i \in \mathbb{C}^{n_r \times n_r}$, $i = 1, \ldots, K$, are the uplink input
covariance matrices, $\mathbf{Q} = \{\mathbf{Q}_1, \ldots, \mathbf{Q}_K\}$ represents the collection
of all the uplink covariance matrices, and $\mathcal{C}_{\mathrm{MAC}}(P, \mathbf{H}^{\dagger})$
represents the capacity region of the dual MIMO-MAC. It
is known that the capacity region of a MIMO-MAC can be
achieved by successive decoding [1]. However, in order to
determine the capacity region of a MIMO-MAC, all possible
successive decoding orders need to be enumerated, which is
very cumbersome. In the following theorem, we
show that the enumeration of all successive decoding orders
is in fact unnecessary when solving the MWSR problem of
the dual MIMO-MAC. This result significantly reduces the
complexity and paves the way to efficiently solve the MWSR
problem by using the CGP method.

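Theorem 1: The MWSR problem in (3) can be solved by
the following equivalent optimization problem:

$$\begin{array}{ll}
\text{Maximize} & \sum_{i=1}^{K} \left(u_{\pi(i)} - u_{\pi(i-1)}\right) \log\det\left(\mathbf{I} + \sum_{j=i}^{K} \mathbf{H}_{\pi(j)}^{\dagger} \mathbf{Q}_{\pi(j)} \mathbf{H}_{\pi(j)}\right) \\
\text{subject to} & \sum_{i=1}^{K} \mathrm{Tr}(\mathbf{Q}_i) \le P \\
& \mathbf{Q}_i \succeq 0, \quad i = 1, \ldots, K,
\end{array} \tag{4}$$

where $u_{\pi(0)} \triangleq 0$, $\pi$ is a permutation of the set $\{1, \ldots, K\}$
such that $u_{\pi(1)} \le \cdots \le u_{\pi(K)}$, and $\pi(i)$, $i = 1, \ldots, K$, represents
the $i$th position in permutation $\pi$.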

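Proof: Let $\Phi(S) = \log\det\left(\mathbf{I} + \sum_{i \in S} \mathbf{H}_i^{\dagger} \mathbf{Q}_i \mathbf{H}_i\right)$,
where $S$ is a non-empty subset of $\{1, \ldots, K\}$. From Theorem
14.3.5 in [1], we know that the maximum weighted sum rate
problem can be written as

$$\begin{array}{ll}
\text{Maximize} & \sum_{i=1}^{K} u_{\pi(i)} R_{\pi(i)}^{\mathrm{MAC}} \\
\text{subject to} & \sum_{i \in S} R_i^{\mathrm{MAC}} \le \Phi(S), \quad \forall S \subseteq \{1, \ldots, K\}.
\end{array}$$

Thus, it is not difficult to see that, when $S = \{\pi(K)\}$,
$R_{\pi(K)}^{\mathrm{MAC}} \le \Phi(\{\pi(K)\}) = \log\det\left(\mathbf{I} + \mathbf{H}_{\pi(K)}^{\dagger} \mathbf{Q}_{\pi(K)} \mathbf{H}_{\pi(K)}\right)$.
Since $u_{\pi(1)} \le \cdots \le u_{\pi(K)}$, from the Karush-Kuhn-Tucker (KKT) condition, we
must have that the constraint $R_{\pi(K)}^{\mathrm{MAC}} \le \Phi(\{\pi(K)\})$ is
tight at optimality. That is,

$$R_{\pi(K)}^{\mathrm{MAC}} = \log\det\left(\mathbf{I} + \mathbf{H}_{\pi(K)}^{\dagger} \mathbf{Q}_{\pi(K)} \mathbf{H}_{\pi(K)}\right). \tag{5}$$

Likewise, when $S = \{\pi(K-1), \pi(K)\}$, we have

$$R_{\pi(K-1)}^{\mathrm{MAC}} + R_{\pi(K)}^{\mathrm{MAC}} \le \log\det\left(\mathbf{I} + \mathbf{H}_{\pi(K)}^{\dagger} \mathbf{Q}_{\pi(K)} \mathbf{H}_{\pi(K)} + \mathbf{H}_{\pi(K-1)}^{\dagger} \mathbf{Q}_{\pi(K-1)} \mathbf{H}_{\pi(K-1)}\right).$$

So, from (5), we have

$$R_{\pi(K-1)}^{\mathrm{MAC}} \le \log\det\left(\mathbf{I} + \mathbf{H}_{\pi(K)}^{\dagger} \mathbf{Q}_{\pi(K)} \mathbf{H}_{\pi(K)} + \mathbf{H}_{\pi(K-1)}^{\dagger} \mathbf{Q}_{\pi(K-1)} \mathbf{H}_{\pi(K-1)}\right) - \log\det\left(\mathbf{I} + \mathbf{H}_{\pi(K)}^{\dagger} \mathbf{Q}_{\pi(K)} \mathbf{H}_{\pi(K)}\right). \tag{6}$$

Since $u_{\pi(K-1)}$ is the second largest weight, again from the KKT
condition, we must have that (6) is tight at optimality.
This process continues for all $K$ users. Subsequently, we have
that

$$R_{\pi(i)}^{\mathrm{MAC}} = \log\det\left(\mathbf{I} + \sum_{j=i}^{K} \mathbf{H}_{\pi(j)}^{\dagger} \mathbf{Q}_{\pi(j)} \mathbf{H}_{\pi(j)}\right) - \log\det\left(\mathbf{I} + \sum_{j=i+1}^{K} \mathbf{H}_{\pi(j)}^{\dagger} \mathbf{Q}_{\pi(j)} \mathbf{H}_{\pi(j)}\right), \tag{7}$$

for $i = 1, \ldots, K - 1$. Summing up all $u_{\pi(i)} R_{\pi(i)}^{\mathrm{MAC}}$ and
rearranging the terms, it is readily verifiable that

$$\sum_{i=1}^{K} u_{\pi(i)} R_{\pi(i)}^{\mathrm{MAC}} = \sum_{i=1}^{K} \left(u_{\pi(i)} - u_{\pi(i-1)}\right) \log\det\left(\mathbf{I} + \sum_{j=i}^{K} \mathbf{H}_{\pi(j)}^{\dagger} \mathbf{Q}_{\pi(j)} \mathbf{H}_{\pi(j)}\right). \tag{8}$$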

It then follows that the MWSR problem of the dual MIMO- 
MAC is equivalent to maximizing (8) with the sum power 
constraint, i.e., the optimization problem in (4). ■ 
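
As a quick sanity check of the rearrangement leading to (8), consider the case $K = 2$. The shorthand $L_1$ and $L_2$ below is introduced here only for illustration and is not part of the original notation:

$$L_1 = \log\det\left(\mathbf{I} + \mathbf{H}_{\pi(1)}^{\dagger} \mathbf{Q}_{\pi(1)} \mathbf{H}_{\pi(1)} + \mathbf{H}_{\pi(2)}^{\dagger} \mathbf{Q}_{\pi(2)} \mathbf{H}_{\pi(2)}\right), \qquad L_2 = \log\det\left(\mathbf{I} + \mathbf{H}_{\pi(2)}^{\dagger} \mathbf{Q}_{\pi(2)} \mathbf{H}_{\pi(2)}\right).$$

With the tight constraints $R_{\pi(2)}^{\mathrm{MAC}} = L_2$ and $R_{\pi(1)}^{\mathrm{MAC}} = L_1 - L_2$, the weighted sum becomes

$$u_{\pi(1)} (L_1 - L_2) + u_{\pi(2)} L_2 = \left(u_{\pi(1)} - 0\right) L_1 + \left(u_{\pi(2)} - u_{\pi(1)}\right) L_2,$$

which is the right-hand side of (8) with $u_{\pi(0)} = 0$. Both coefficients are non-negative because $u_{\pi(1)} \le u_{\pi(2)}$.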
An important observation from (4) is that, since $\log\det(\cdot)$ is
a concave function for positive semidefinite matrices [1] and the
coefficients $u_{\pi(i)} - u_{\pi(i-1)}$ are non-negative by the ordering of $\pi$, (4) is
a convex optimization problem with respect to the uplink input
covariance matrices $\mathbf{Q}_{\pi(1)}, \ldots, \mathbf{Q}_{\pi(K)}$. However, although the
standard interior point convex optimization method can be
used to solve (4), it is considerably more complex than a
method that exploits the special structure of (4).
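
To make the objective of (4) concrete, the following is a minimal NumPy sketch of how it could be evaluated. The function name `mwsr_objective` and the list-based data layout (`H[i]` of shape $n_r \times n_t$, `Q[i]` of shape $n_r \times n_r$) are assumptions made for illustration, and the natural logarithm is assumed.

```python
import numpy as np

def mwsr_objective(H, Q, u):
    """Evaluate the objective of (4) for channels H[i] (n_r x n_t),
    uplink covariances Q[i] (n_r x n_r), and weights u[i]."""
    K = len(H)
    n_t = H[0].shape[1]
    order = np.argsort(u)                 # pi: users by non-decreasing weight
    running = np.eye(n_t, dtype=complex)
    logdets = np.empty(K)
    # Build I + sum_{j>=i} H^† Q H for i = K, ..., 1 as a running sum.
    for pos in range(K - 1, -1, -1):
        k = order[pos]
        running = running + H[k].conj().T @ Q[k] @ H[k]
        logdets[pos] = np.linalg.slogdet(running)[1]
    u_sorted = np.asarray(u, dtype=float)[order]
    coeff = np.diff(np.concatenate(([0.0], u_sorted)))  # u_pi(i) - u_pi(i-1)
    return float(coeff @ logdets)
```

With equal weights, only the first coefficient is non-zero, so the value reduces to the sum-rate expression $\log\det(\mathbf{I} + \sum_j \mathbf{H}_j^{\dagger}\mathbf{Q}_j\mathbf{H}_j)$, in line with MSR being the special case of MWSR.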

III. Conjugate Gradient Projection Method 

In this paper, we modify the conjugate gradient projection
(CGP) method in [11] to solve (4). CGP utilizes the important
and powerful concept of Hessian conjugacy to deflect the
gradient direction appropriately so as to achieve a superlinear
convergence rate [13]. The framework of CGP for solving (4)
is shown in Algorithm 1.

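Algorithm 1 Gradient Projection Method
Initialization:

Choose the initial conditions $\mathbf{Q}^{(0)} = [\mathbf{Q}_1^{(0)}, \mathbf{Q}_2^{(0)}, \ldots, \mathbf{Q}_K^{(0)}]^T$. Let
$k = 0$.
Main Loop:

1. Calculate the conjugate gradients $\mathbf{G}_i^{(k)}$, $i = 1, 2, \ldots, K$.

2. Choose an appropriate step size $s_k$. Let $\mathbf{Q}_i'^{(k)} = \mathbf{Q}_i^{(k)} + s_k \mathbf{G}_i^{(k)}$,
for $i = 1, 2, \ldots, K$.

3. Let $\bar{\mathbf{Q}}_i^{(k)}$ be the projection of $\mathbf{Q}_i'^{(k)}$ onto $\Omega_+(P)$, where $\Omega_+(P) =
\{\mathbf{Q}_i, i = 1, \ldots, K \mid \mathbf{Q}_i \succeq 0, \sum_{i=1}^{K} \mathrm{Tr}\{\mathbf{Q}_i\} \le P\}$.

4. Choose an appropriate step size $\alpha_k$. Let $\mathbf{Q}_i^{(k+1)} = \mathbf{Q}_i^{(k)} + \alpha_k(\bar{\mathbf{Q}}_i^{(k)} -
\mathbf{Q}_i^{(k)})$, $i = 1, 2, \ldots, K$.

5. $k = k + 1$. If the maximum absolute value of the elements in
$\mathbf{Q}_i^{(k)} - \mathbf{Q}_i^{(k-1)}$ is less than $\epsilon$, for $i = 1, 2, \ldots, K$, then stop; else go to step 1.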



Due to the complexity of the objective function in (4),
we adopt the inexact line search method called "Armijo's
Rule" to avoid excessive objective function evaluations, while
still enjoying provable convergence [13]. The basic idea of
Armijo's Rule is that at each step of the line search, we
sacrifice accuracy for efficiency as long as we have sufficient
improvement. According to Armijo's Rule, in the $k$th iteration,
we choose $s_k = 1$ and $\alpha_k = \beta^{m_k}$ (the same as in [12]), where
$m_k$ is the first non-negative integer $m$ that satisfies

$$F(\mathbf{Q}^{(k+1)}) - F(\mathbf{Q}^{(k)}) \ge \sigma \beta^{m} \left\langle \mathbf{G}^{(k)}, \bar{\mathbf{Q}}^{(k)} - \mathbf{Q}^{(k)} \right\rangle = \sigma \beta^{m} \sum_{i=1}^{K} \mathrm{Tr}\left[\mathbf{G}_i^{\dagger(k)} \left(\bar{\mathbf{Q}}_i^{(k)} - \mathbf{Q}_i^{(k)}\right)\right],$$

where $0 < \beta < 1$ and $0 < \sigma < 1$ are fixed scalars.
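
The backtracking described by Armijo's Rule can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the function name `armijo_step`, the default values of `beta` and `sigma`, the cap `max_trials`, and the fallback of returning the unchanged point when no trial passes are all assumptions.

```python
import numpy as np

def armijo_step(F, Q, Q_bar, G, beta=0.5, sigma=0.1, max_trials=30):
    """Return (alpha, new_Q) with alpha = beta**m for the first m that passes
    the sufficient-increase test. Q, Q_bar, G are lists of Hermitian matrices;
    F maps such a list to a real number."""
    # <G, Q_bar - Q> = sum_i Re Tr[G_i^† (Q_bar_i - Q_i)]
    slope = sum(np.real(np.trace(g.conj().T @ (qb - q)))
                for g, qb, q in zip(G, Q_bar, Q))
    F0 = F(Q)
    alpha = 1.0
    for _ in range(max_trials):
        trial = [q + alpha * (qb - q) for q, qb in zip(Q, Q_bar)]
        if F(trial) - F0 >= sigma * alpha * slope:
            return alpha, trial
        alpha *= beta                      # shrink the step and try again
    return 0.0, [q.copy() for q in Q]      # assumed fallback: do not move
```

Each trial costs one evaluation of `F`, which is why the number of trials $m$ enters the per-iteration cost of the line search.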

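Next, we will consider two major components in the CGP
framework: 1) how to compute the conjugate gradient direction
$\mathbf{G}_i$, and 2) how to project $\mathbf{Q}_i'^{(k)}$ onto the set $\Omega_+(P) =
\{\mathbf{Q}_i, i = 1, \ldots, K \mid \mathbf{Q}_i \succeq 0, \sum_{i=1}^{K} \mathrm{Tr}\{\mathbf{Q}_i\} \le P\}$.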



A. Computing the Conjugate Gradients 

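The gradient $\bar{\mathbf{G}}_{\pi(j)} \triangleq \nabla_{\mathbf{Q}_{\pi(j)}} F(\mathbf{Q})$ depends on the partial
derivative of $F(\mathbf{Q})$ with respect to $\mathbf{Q}_{\pi(j)}$. By using the
formula $\frac{\partial \ln\det(\mathbf{A} + \mathbf{B}\mathbf{X}\mathbf{C})}{\partial \mathbf{X}} = \left[\mathbf{C}(\mathbf{A} + \mathbf{B}\mathbf{X}\mathbf{C})^{-1}\mathbf{B}\right]^T$ [12], [14],
we can compute the partial derivative of the $i$th term in the
summation of $F(\mathbf{Q})$ with respect to $\mathbf{Q}_{\pi(j)}$, $j \ge i$, as follows:

$$\frac{\partial}{\partial \mathbf{Q}_{\pi(j)}} \left( \left(u_{\pi(i)} - u_{\pi(i-1)}\right) \log\det\left(\mathbf{I} + \sum_{k=i}^{K} \mathbf{H}_{\pi(k)}^{\dagger} \mathbf{Q}_{\pi(k)} \mathbf{H}_{\pi(k)}\right) \right) = \left(u_{\pi(i)} - u_{\pi(i-1)}\right) \left[ \mathbf{H}_{\pi(j)} \left(\mathbf{I} + \sum_{k=i}^{K} \mathbf{H}_{\pi(k)}^{\dagger} \mathbf{Q}_{\pi(k)} \mathbf{H}_{\pi(k)}\right)^{-1} \mathbf{H}_{\pi(j)}^{\dagger} \right]^T.$$

To compute the gradient of $F(\mathbf{Q})$ with respect to $\mathbf{Q}_{\pi(j)}$, we
notice that only the first $j$ terms in $F(\mathbf{Q})$ involve $\mathbf{Q}_{\pi(j)}$. From
the definition $\nabla_{\mathbf{z}} f(\mathbf{z}) = 2(\partial f(\mathbf{z})/\partial \mathbf{z})^{*}$ [15], we have

$$\bar{\mathbf{G}}_{\pi(j)} = 2 \mathbf{H}_{\pi(j)} \left[ \sum_{i=1}^{j} \left(u_{\pi(i)} - u_{\pi(i-1)}\right) \left(\mathbf{I} + \sum_{k=i}^{K} \mathbf{H}_{\pi(k)}^{\dagger} \mathbf{Q}_{\pi(k)} \mathbf{H}_{\pi(k)}\right)^{-1} \right] \mathbf{H}_{\pi(j)}^{\dagger}. \tag{9}$$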

It is worth pointing out that we can exploit the special
structure in (9) to significantly reduce the computational
complexity in the implementation of the algorithm. Note that the
most difficult part in computing $\bar{\mathbf{G}}_{\pi(j)}$ is the summation of
the terms in the form of $\mathbf{H}_{\pi(k)}^{\dagger}\mathbf{Q}_{\pi(k)}\mathbf{H}_{\pi(k)}$. Without careful
consideration, one may end up computing such additions
$j(2K + 1 - j)/2$ times for $\bar{\mathbf{G}}_{\pi(j)}$. However, noting that most of
the terms in the summation are still the same when $j$ varies, we
can maintain a running sum for $\mathbf{I} + \sum_{k=i}^{K} \mathbf{H}_{\pi(k)}^{\dagger}\mathbf{Q}_{\pi(k)}\mathbf{H}_{\pi(k)}$,
start out from $j = K$, and reduce $j$ by one sequentially. As a
result, only one new term is added to the running sum in each
iteration, which means we only need to do the addition once
in each iteration.
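
The running-sum idea can be sketched in NumPy as follows. The function name `mwsr_gradients` and the list-based data layout are assumptions for illustration; the sketch evaluates (9) for all users with one backward pass that builds the inverses and one forward pass that accumulates the weighted prefix sums.

```python
import numpy as np

def mwsr_gradients(H, Q, u):
    """Gradients (9) for all users, using one running sum over sorted users."""
    K = len(H)
    n_t = H[0].shape[1]
    order = np.argsort(u)                          # pi(1), ..., pi(K)
    u_sorted = np.asarray(u, dtype=float)[order]
    coeff = np.diff(np.concatenate(([0.0], u_sorted)))
    # inv_terms[i] = (I + sum_{k>=i} H^† Q H)^{-1}, built from i = K down to 1;
    # each step adds exactly one new H^† Q H term to the running sum.
    running = np.eye(n_t, dtype=complex)
    inv_terms = [None] * K
    for pos in range(K - 1, -1, -1):
        k = order[pos]
        running = running + H[k].conj().T @ Q[k] @ H[k]
        inv_terms[pos] = np.linalg.inv(running)
    # Prefix sums over i = 1..j of coeff_i * inv_terms[i].
    grads = [None] * K
    acc = np.zeros((n_t, n_t), dtype=complex)
    for pos in range(K):
        acc = acc + coeff[pos] * inv_terms[pos]
        k = order[pos]
        grads[k] = 2.0 * H[k] @ acc @ H[k].conj().T
    return grads
```

Because (9) follows the convention $\nabla_{\mathbf{z}} f = 2(\partial f/\partial \mathbf{z})^{*}$, the directional derivative of $F$ along a Hermitian direction $\mathbf{E}$ applied to user $j$ equals $\frac{1}{2}\mathrm{Tr}(\bar{\mathbf{G}}_j \mathbf{E})$, which gives a convenient finite-difference check.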

The conjugate gradient direction in the $m$th iteration can
be computed as $\mathbf{G}^{(m)} = \bar{\mathbf{G}}^{(m)} + \rho_m \mathbf{G}^{(m-1)}$, where $\bar{\mathbf{G}}^{(m)}$ denotes
the gradient in (9) at the $m$th iteration. We adopt
Fletcher and Reeves' choice of deflection [13], which can be
computed as

$$\rho_m = \frac{\|\bar{\mathbf{G}}^{(m)}\|^2}{\|\bar{\mathbf{G}}^{(m-1)}\|^2}. \tag{10}$$

The purpose of deflecting the gradient using (10) is to find
$\mathbf{G}^{(m)}$, which is the Hessian-conjugate of $\mathbf{G}^{(m-1)}$. By doing
so, we can eliminate the "zigzagging" phenomenon encountered
in the conventional gradient projection method, and
achieve a superlinear convergence rate [13] without actually
storing a large Hessian approximation matrix as in quasi-Newton
methods.
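
A minimal sketch of the Fletcher-Reeves deflection in (10) is given below. The function name `fletcher_reeves_direction` is hypothetical, and the choice of norm is an assumption: the squared Frobenius norms are summed over all users' gradient matrices, which the text does not spell out.

```python
import numpy as np

def fletcher_reeves_direction(grad, prev_grad=None, prev_dir=None):
    """Deflect the current gradients by rho_m times the previous direction.
    grad, prev_grad, prev_dir are lists with one matrix per user."""
    if prev_grad is None or prev_dir is None:
        return [g.copy() for g in grad]           # first iteration: plain gradient
    # Assumed norm: squared Frobenius norm summed over all users.
    num = sum(np.linalg.norm(g, 'fro') ** 2 for g in grad)
    den = sum(np.linalg.norm(g, 'fro') ** 2 for g in prev_grad)
    rho = num / den if den > 0 else 0.0
    return [g + rho * d for g, d in zip(grad, prev_dir)]
```

Only the previous gradient and the previous direction are kept between iterations, which reflects the modest memory requirement claimed for CGP.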



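B. Projection onto $\Omega_+(P)$

Noting from (9) that $\mathbf{G}_i^{(k)}$ is Hermitian, we have that $\mathbf{Q}_i'^{(k)} =
\mathbf{Q}_i^{(k)} + s_k \mathbf{G}_i^{(k)}$ is Hermitian as well. Then, the projection
problem becomes how to simultaneously project a set of
$K$ Hermitian matrices onto the set $\Omega_+(P)$, which contains
a constraint on sum power for all users. This is different
from [12], where the projection was performed under individual
power constraints. In order to do this, we construct a block
diagonal matrix $\mathbf{D} = \mathrm{Diag}\{\mathbf{Q}_1 \ldots \mathbf{Q}_K\} \in \mathbb{C}^{(K \cdot n_r) \times (K \cdot n_r)}$. It
is easy to recognize that $\mathbf{Q}_i \in \Omega_+(P)$, $i = 1, \ldots, K$, only
if $\mathrm{Tr}(\mathbf{D}) = \sum_{i=1}^{K} \mathrm{Tr}(\mathbf{Q}_i) \le P$ and $\mathbf{D} \succeq 0$. In this paper,
we use the Frobenius norm, denoted by $\|\cdot\|_F$, as the matrix
distance metric. The distance between two matrices $\mathbf{A}$ and
$\mathbf{B}$ is defined as $\|\mathbf{A} - \mathbf{B}\|_F = \left(\mathrm{Tr}\left[(\mathbf{A} - \mathbf{B})^{\dagger}(\mathbf{A} - \mathbf{B})\right]\right)^{1/2}$.
Thus, given a block diagonal matrix $\mathbf{D}$, we wish to find a
matrix $\tilde{\mathbf{D}} \in \Omega_+(P)$ such that $\tilde{\mathbf{D}}$ minimizes $\|\tilde{\mathbf{D}} - \mathbf{D}\|_F$. For
more convenient algebraic manipulations, we instead study the
following equivalent optimization problem:

$$\begin{array}{ll}
\text{Minimize} & \frac{1}{2}\|\tilde{\mathbf{D}} - \mathbf{D}\|_F^2 \\
\text{subject to} & \mathrm{Tr}(\tilde{\mathbf{D}}) \le P, \quad \tilde{\mathbf{D}} \succeq 0.
\end{array} \tag{11}$$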

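In (11), the objective function is convex in $\tilde{\mathbf{D}}$, the constraint
$\tilde{\mathbf{D}} \succeq 0$ represents the convex cone of positive semidefinite
matrices, and the constraint $\mathrm{Tr}(\tilde{\mathbf{D}}) \le P$ is a linear constraint.
Thus, the problem is a convex minimization problem and we
can exactly solve this problem by solving its Lagrangian dual
problem. Associating a Hermitian matrix $\mathbf{X}$ with the constraint
$\tilde{\mathbf{D}} \succeq 0$ and $\mu$ with the constraint $\mathrm{Tr}(\tilde{\mathbf{D}}) \le P$, we can write the
Lagrangian dual function as

$$g(\mathbf{X}, \mu) = \min_{\tilde{\mathbf{D}}} \left\{ \frac{1}{2}\|\tilde{\mathbf{D}} - \mathbf{D}\|_F^2 - \mathrm{Tr}(\mathbf{X}^{\dagger}\tilde{\mathbf{D}}) + \mu\left(\mathrm{Tr}(\tilde{\mathbf{D}}) - P\right) \right\}. \tag{12}$$

Since $g(\mathbf{X}, \mu)$ is an unconstrained quadratic minimization
problem, we can compute the minimizer of (12) by simply
setting the derivative of (12) (with respect to $\tilde{\mathbf{D}}$) to zero, i.e.,
$(\tilde{\mathbf{D}} - \mathbf{D}) - \mathbf{X}^{\dagger} + \mu\mathbf{I} = 0$. Noting that $\mathbf{X}^{\dagger} = \mathbf{X}$, we have
$\tilde{\mathbf{D}} = \mathbf{D} - \mu\mathbf{I} + \mathbf{X}$. Substituting $\tilde{\mathbf{D}}$ back into (12), we have

$$\begin{aligned}
g(\mathbf{X}, \mu) &= \frac{1}{2}\|\mathbf{X} - \mu\mathbf{I}\|_F^2 - \mu P + \mathrm{Tr}\left[(\mu\mathbf{I} - \mathbf{X})(\mathbf{D} + \mathbf{X} - \mu\mathbf{I})\right] \\
&= -\frac{1}{2}\|\mathbf{D} - \mu\mathbf{I} + \mathbf{X}\|_F^2 - \mu P + \frac{1}{2}\|\mathbf{D}\|_F^2.
\end{aligned} \tag{13}$$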
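
The second equality in (13) can be verified in two lines. The shorthand $\mathbf{Y} \triangleq \mathbf{X} - \mu\mathbf{I}$ is introduced here only for this check; $\mathbf{Y}$ and $\mathbf{D}$ are both Hermitian, so $\mathrm{Tr}(\mathbf{Y}\mathbf{D})$ is real and $\|\mathbf{Y}\|_F^2 = \mathrm{Tr}(\mathbf{Y}\mathbf{Y})$:

$$\frac{1}{2}\|\mathbf{Y}\|_F^2 - \mu P - \mathrm{Tr}\left[\mathbf{Y}(\mathbf{D} + \mathbf{Y})\right] = -\frac{1}{2}\|\mathbf{Y}\|_F^2 - \mathrm{Tr}(\mathbf{Y}\mathbf{D}) - \mu P,$$

$$-\frac{1}{2}\|\mathbf{D} + \mathbf{Y}\|_F^2 + \frac{1}{2}\|\mathbf{D}\|_F^2 = -\frac{1}{2}\left(\|\mathbf{D}\|_F^2 + 2\,\mathrm{Tr}(\mathbf{Y}\mathbf{D}) + \|\mathbf{Y}\|_F^2\right) + \frac{1}{2}\|\mathbf{D}\|_F^2 = -\frac{1}{2}\|\mathbf{Y}\|_F^2 - \mathrm{Tr}(\mathbf{Y}\mathbf{D}).$$

Subtracting $\mu P$ from the second line reproduces the first, which confirms the equality.
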
Therefore, the Lagrangian dual problem can be written as

$$\begin{array}{ll}
\text{Maximize} & -\frac{1}{2}\|\mathbf{D} - \mu\mathbf{I} + \mathbf{X}\|_F^2 - \mu P + \frac{1}{2}\|\mathbf{D}\|_F^2 \\
\text{subject to} & \mathbf{X} \succeq 0, \ \mu \ge 0.
\end{array} \tag{14}$$

After solving (14), we can obtain the optimal solution to (11)
as:

$$\tilde{\mathbf{D}}^{*} = \mathbf{D} - \mu^{*}\mathbf{I} + \mathbf{X}^{*}, \tag{15}$$

where $\mu^{*}$ and $\mathbf{X}^{*}$ are the optimal dual solutions to the Lagrangian
dual problem in (14). Although the Lagrangian dual problem
in (14) has a similar structure to that of the primal problem
in (11) (having a positive semidefinite matrix constraint), we
find that the positive semidefinite matrix constraint can in fact
be easily handled. To see this, we first introduce the Moreau
Decomposition Theorem from convex analysis.

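Theorem 2: (Moreau Decomposition [16]) Let $\mathcal{K}$ be a
closed convex cone. For $\mathbf{x}, \mathbf{x}_1, \mathbf{x}_2 \in \mathbb{C}^p$, the two properties
below are equivalent:

1) $\mathbf{x} = \mathbf{x}_1 + \mathbf{x}_2$ with $\mathbf{x}_1 \in \mathcal{K}$, $\mathbf{x}_2 \in \mathcal{K}^{o}$ and $\langle \mathbf{x}_1, \mathbf{x}_2 \rangle = 0$,

2) $\mathbf{x}_1 = p_{\mathcal{K}}(\mathbf{x})$ and $\mathbf{x}_2 = p_{\mathcal{K}^{o}}(\mathbf{x})$,

where $\mathcal{K}^{o} \triangleq \{\mathbf{s} \in \mathbb{C}^p : \langle \mathbf{s}, \mathbf{y} \rangle \le 0, \forall \mathbf{y} \in \mathcal{K}\}$ is called the
polar cone of cone $\mathcal{K}$, and $p_{\mathcal{K}}(\cdot)$ represents the projection onto
cone $\mathcal{K}$.

In fact, the projection onto a cone $\mathcal{K}$ is analogous to the
projection onto a subspace. The only difference is that the
orthogonal subspace is replaced by the polar cone.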

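Now we consider how to project a Hermitian matrix $\mathbf{A} \in
\mathbb{C}^{n \times n}$ onto the positive and negative semidefinite cones. First,
we can perform eigenvalue decomposition on $\mathbf{A}$, yielding $\mathbf{A} =
\mathbf{U}\,\mathrm{Diag}\{\lambda_i, i = 1, \ldots, n\}\,\mathbf{U}^{\dagger}$, where $\mathbf{U}$ is the unitary matrix
formed by the eigenvectors corresponding to the eigenvalues
$\lambda_i$, $i = 1, \ldots, n$. Then, we have the positive semidefinite and
negative semidefinite projections of $\mathbf{A}$ as follows:

$$\mathbf{A}_{+} = \mathbf{U}\,\mathrm{Diag}\{\max\{\lambda_i, 0\}, i = 1, 2, \ldots, n\}\,\mathbf{U}^{\dagger}, \tag{16}$$
$$\mathbf{A}_{-} = \mathbf{U}\,\mathrm{Diag}\{\min\{\lambda_i, 0\}, i = 1, 2, \ldots, n\}\,\mathbf{U}^{\dagger}. \tag{17}$$

The proof of (16) and (17) is a straightforward application of
Theorem 2 by noting that $\mathbf{A}_{+} \succeq 0$, $\mathbf{A}_{-} \preceq 0$, $\langle \mathbf{A}_{+}, \mathbf{A}_{-} \rangle =
0$, $\mathbf{A}_{+} + \mathbf{A}_{-} = \mathbf{A}$, and the positive semidefinite cone and
negative semidefinite cone are polar cones to each other.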
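
The two projections in (16) and (17) can be sketched with a single Hermitian eigenvalue decomposition. The function name `psd_nsd_parts` is hypothetical; `np.linalg.eigh` is used because the input is assumed Hermitian.

```python
import numpy as np

def psd_nsd_parts(A):
    """Split a Hermitian matrix A into its projections A_plus (16) and A_minus (17)."""
    lam, U = np.linalg.eigh(A)                    # A = U diag(lam) U^†
    # U * v scales column j of U by v[j], i.e. U @ diag(v).
    A_plus = (U * np.maximum(lam, 0.0)) @ U.conj().T
    A_minus = (U * np.minimum(lam, 0.0)) @ U.conj().T
    return A_plus, A_minus
```

The two parts sum to $\mathbf{A}$ and are orthogonal under the trace inner product, which are the conditions of Theorem 2.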

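We now consider the term $\mathbf{D} - \mu\mathbf{I} + \mathbf{X}$, which is the
only term involving $\mathbf{X}$ in the dual objective function. We
can rewrite it as $\mathbf{D} - \mu\mathbf{I} - (-\mathbf{X})$, where we note that
$-\mathbf{X} \preceq 0$. Finding a negative semidefinite matrix $-\mathbf{X}$ such
that $\|\mathbf{D} - \mu\mathbf{I} - (-\mathbf{X})\|_F$ is minimized is equivalent to finding
the projection of $\mathbf{D} - \mu\mathbf{I}$ onto the negative semidefinite cone.
From the previous discussion, we immediately have

$$-\mathbf{X} = (\mathbf{D} - \mu\mathbf{I})_{-}. \tag{18}$$

Since $\mathbf{D} - \mu\mathbf{I} = (\mathbf{D} - \mu\mathbf{I})_{+} + (\mathbf{D} - \mu\mathbf{I})_{-}$, substituting (18)
back into the Lagrangian dual objective function, we have

$$\min_{\mathbf{X}} \|\mathbf{D} - \mu\mathbf{I} + \mathbf{X}\|_F = \|(\mathbf{D} - \mu\mathbf{I})_{+}\|_F. \tag{19}$$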

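Thus, the matrix variable $\mathbf{X}$ in the Lagrangian dual problem
can be removed and the Lagrangian dual problem can be
rewritten as

$$\begin{array}{ll}
\text{Maximize} & \psi(\mu) \triangleq -\frac{1}{2}\|(\mathbf{D} - \mu\mathbf{I})_{+}\|_F^2 - \mu P + \frac{1}{2}\|\mathbf{D}\|_F^2 \\
\text{subject to} & \mu \ge 0.
\end{array} \tag{20}$$

Suppose that after performing eigenvalue decomposition on $\mathbf{D}$,
we have $\mathbf{D} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^{\dagger}$, where $\boldsymbol{\Lambda}$ is the diagonal matrix formed
by the eigenvalues of $\mathbf{D}$, and $\mathbf{U}$ is the unitary matrix formed by
the corresponding eigenvectors. Since $\mathbf{U}$ is unitary, we have
$(\mathbf{D} - \mu\mathbf{I})_{+} = \mathbf{U}(\boldsymbol{\Lambda} - \mu\mathbf{I})_{+}\mathbf{U}^{\dagger}$. It then follows that

$$\|(\mathbf{D} - \mu\mathbf{I})_{+}\|_F^2 = \|(\boldsymbol{\Lambda} - \mu\mathbf{I})_{+}\|_F^2. \tag{21}$$

We denote the eigenvalues in $\boldsymbol{\Lambda}$ by $\lambda_i$, $i = 1, 2, \ldots, K \cdot n_r$.
Suppose that we sort them in non-increasing order such that
$\boldsymbol{\Lambda} = \mathrm{Diag}\{\lambda_1\ \lambda_2 \ldots \lambda_{K \cdot n_r}\}$, where $\lambda_1 \ge \ldots \ge \lambda_{K \cdot n_r}$. It
then follows that

$$\|(\boldsymbol{\Lambda} - \mu\mathbf{I})_{+}\|_F^2 = \sum_{i=1}^{K \cdot n_r} \left(\max\{0, \lambda_i - \mu\}\right)^2. \tag{22}$$

From (22), we can rewrite $\psi(\mu)$ as

$$\psi(\mu) = -\frac{1}{2}\sum_{i=1}^{K \cdot n_r} \left(\max\{0, \lambda_i - \mu\}\right)^2 - \mu P + \frac{1}{2}\|\mathbf{D}\|_F^2. \tag{23}$$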

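It is evident from (23) that $\psi(\mu)$ is continuous and (piece-wise)
concave in $\mu$. Generally, piece-wise concave maximization
problems can be solved by using the subgradient method.
However, due to the heuristic nature of its step size selection
strategy, the subgradient algorithm usually does not perform
well. In fact, by exploiting the special structure, (20) can be
efficiently solved. We can search for the optimal value of $\mu$ as
follows. Let $\hat{I}$ index the pieces of $\psi(\mu)$, $\hat{I} = 0, 1, \ldots, K \cdot n_r$.
Initially we set $\hat{I} = 0$ and increase $\hat{I}$ subsequently. Also, we
introduce $\lambda_0 = \infty$ and $\lambda_{K \cdot n_r + 1} = -\infty$. We let the endpoint
objective value $\psi_{\hat{I}}(\lambda_0) = 0$, $\phi^{*} = \psi_{\hat{I}}(\lambda_0)$, and $\mu^{*} = \lambda_0$.
If $\hat{I} > K \cdot n_r$, the search stops. For a particular index $\hat{I}$, by
setting

$$\frac{\partial}{\partial \mu}\psi_{\hat{I}}(\mu) \triangleq \frac{\partial}{\partial \mu}\left(-\frac{1}{2}\sum_{i=1}^{\hat{I}} (\lambda_i - \mu)^2 - \mu P\right) = 0, \tag{24}$$

we have

$$\mu_{\hat{I}}^{*} = \frac{\sum_{i=1}^{\hat{I}} \lambda_i - P}{\hat{I}}. \tag{25}$$

Now we consider the following two cases:

1) If $\mu_{\hat{I}}^{*} \in [\lambda_{\hat{I}+1}, \lambda_{\hat{I}}] \cap \mathbb{R}_{+}$, where $\mathbb{R}_{+}$ denotes the set
of non-negative real numbers, then we have found the
optimal solution for $\mu$ because $\psi(\mu)$ is concave in $\mu$.
Thus, the point having a zero-value first derivative, if it
exists, must be the unique global maximum solution.
Hence, we can let $\mu^{*} = \mu_{\hat{I}}^{*}$ and the search is done.

2) If $\mu_{\hat{I}}^{*} \notin [\lambda_{\hat{I}+1}, \lambda_{\hat{I}}] \cap \mathbb{R}_{+}$, we must have that the local
maximum in the interval $[\lambda_{\hat{I}+1}, \lambda_{\hat{I}}] \cap \mathbb{R}_{+}$ is achieved at
one of the two endpoints. Note that the objective value
$\psi_{\hat{I}}(\lambda_{\hat{I}})$ has been computed in the previous iteration
because from the continuity of the objective function,
we have $\psi_{\hat{I}}(\lambda_{\hat{I}}) = \psi_{\hat{I}-1}(\lambda_{\hat{I}})$. Thus, we only need to
compute the other endpoint objective value $\psi_{\hat{I}}(\lambda_{\hat{I}+1})$.
If $\psi_{\hat{I}}(\lambda_{\hat{I}+1}) < \psi_{\hat{I}}(\lambda_{\hat{I}}) = \phi^{*}$, then we know $\mu^{*}$ is the
optimal solution; else let $\mu^{*} = \lambda_{\hat{I}+1}$, $\phi^{*} = \psi_{\hat{I}}(\lambda_{\hat{I}+1})$,
$\hat{I} = \hat{I} + 1$ and continue.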

Since there are $K \cdot n_r + 1$ intervals in total, the search process
takes at most $K \cdot n_r + 1$ steps to find the optimal solution $\mu^{*}$.
Hence, this search is of polynomial-time complexity $O(n_r K)$.
After finding $\mu^{*}$, we can compute $\tilde{\mathbf{D}}^{*}$ as

$$\tilde{\mathbf{D}}^{*} = (\mathbf{D} - \mu^{*}\mathbf{I})_{+} = \mathbf{U}(\boldsymbol{\Lambda} - \mu^{*}\mathbf{I})_{+}\mathbf{U}^{\dagger}. \tag{26}$$

That is, the projection $\tilde{\mathbf{D}}$ can be computed by adjusting
the eigenvalues of $\mathbf{D}$ using $\mu^{*}$ and keeping the eigenvectors
unchanged. The projection of $\mathbf{D}$ onto $\Omega_+(P)$ is summarized
in Algorithm 2.



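Algorithm 2 Projection onto $\Omega_+(P)$
Initialization:

1. Construct a block diagonal matrix $\mathbf{D}$. Perform eigenvalue decomposition
$\mathbf{D} = \mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^{\dagger}$, sort the eigenvalues in non-increasing order.

2. Introduce $\lambda_0 = \infty$ and $\lambda_{K \cdot n_r + 1} = -\infty$. Let $\hat{I} = 0$. Let the
endpoint objective value $\psi_{\hat{I}}(\lambda_0) = 0$, $\phi^{*} = \psi_{\hat{I}}(\lambda_0)$, and $\mu^{*} = \lambda_0$.

Main Loop:

1. If $\hat{I} > K \cdot n_r$, go to the final step; else let $\mu_{\hat{I}}^{*} = (\sum_{j=1}^{\hat{I}} \lambda_j - P)/\hat{I}$.

2. If $\mu_{\hat{I}}^{*} \in [\lambda_{\hat{I}+1}, \lambda_{\hat{I}}] \cap \mathbb{R}_{+}$, then let $\mu^{*} = \mu_{\hat{I}}^{*}$ and go to the final step.

3. Compute $\psi_{\hat{I}}(\lambda_{\hat{I}+1})$. If $\psi_{\hat{I}}(\lambda_{\hat{I}+1}) < \phi^{*}$, then go to the final step;
else let $\mu^{*} = \lambda_{\hat{I}+1}$, $\phi^{*} = \psi_{\hat{I}}(\lambda_{\hat{I}+1})$, $\hat{I} = \hat{I} + 1$ and continue.

Final Step: Compute $\tilde{\mathbf{D}}$ as $\tilde{\mathbf{D}} = \mathbf{U}(\boldsymbol{\Lambda} - \mu^{*}\mathbf{I})_{+}\mathbf{U}^{\dagger}$.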
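
The projection can be sketched compactly in NumPy. This is a simplified variant, not a line-by-line transcription of Algorithm 2: instead of tracking endpoint objective values, it uses the fact that, from (23), $\psi'(\mu) = \sum_i \max\{\lambda_i - \mu, 0\} - P$, so $\mu^{*} = 0$ when the power constraint is inactive and otherwise $\mu^{*}$ is the stationary point (25) of the correct piece. The function name `project_onto_omega` is hypothetical, $P > 0$ is assumed, and the eigenvalue decomposition is done per block, which is equivalent for a block diagonal $\mathbf{D}$.

```python
import numpy as np

def project_onto_omega(Q_list, P):
    """Project Hermitian blocks Q_1..Q_K onto {Q_i >= 0, sum_i Tr(Q_i) <= P}."""
    # D = Diag{Q_1..Q_K} is block diagonal, so its eigen-decomposition is the
    # collection of the per-block eigen-decompositions.
    eigs = [np.linalg.eigh(Q) for Q in Q_list]
    lam = np.sort(np.concatenate([w for w, _ in eigs]))[::-1]  # non-increasing
    if np.sum(np.maximum(lam, 0.0)) <= P:
        mu = 0.0                                   # power constraint inactive
    else:
        csum = np.cumsum(lam)
        idx = np.arange(1, len(lam) + 1)
        cand = (csum - P) / idx                    # stationary points, cf. (25)
        I_hat = np.nonzero(lam > cand)[0][-1]      # last piece with lam_I > mu_I
        mu = cand[I_hat]
    # Shift and clip the eigenvalues; eigenvectors are unchanged, cf. (26).
    return [(U * np.maximum(w - mu, 0.0)) @ U.conj().T for w, U in eigs]
```

For example, with blocks $\mathrm{Diag}\{3, 1\}$ and $\mathrm{Diag}\{2, -1\}$ and $P = 2$, the sorted eigenvalues are $3, 2, 1, -1$, the search gives $\mu^{*} = (3 + 2 - 2)/2 = 1.5$, and the projected blocks are $\mathrm{Diag}\{1.5, 0\}$ and $\mathrm{Diag}\{0.5, 0\}$, whose traces sum to $P$.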



IV. Complexity Analysis 

In this section, we analyze the complexity of our proposed
CGP algorithm. Similar to IWFs [10], SD [8], and DD [9],
CGP has the desirable "linear complexity property". We list
the complexity per iteration for each component of CGP in
Table I.

TABLE I

Per-Iteration Complexity of the Components of CGP

Component      CGP
Gradient       $K$
Line Search    $O(mK)$
Projection     $O(n_r K)$
Overall        $O((m + 1 + n_r)K)$

In CGP, it can be seen that the most time-consuming
part (increasing with respect to $K$) is the addition of the terms
in the form of $\mathbf{H}_i^{\dagger}\mathbf{Q}_i\mathbf{H}_i$ when computing gradients. Since the
term $\mathbf{I} + \sum_{k=i}^{K} \mathbf{H}_{\pi(k)}^{\dagger}\mathbf{Q}_{\pi(k)}\mathbf{H}_{\pi(k)}$ can be computed by the running
sum, we only need to compute this sum once in each iteration.
Thus, the number of such additions per iteration for CGP
is $K$. It is also obvious that the projection in each iteration
of CGP has a complexity of $O(n_r K)$. The
Armijo's Rule inexact line search has a complexity of
$O(mK)$ (in terms of the additions of $\mathbf{H}_i^{\dagger}\mathbf{Q}_i\mathbf{H}_i$ terms), where
$m$ is the number of trials in Armijo's Rule. Therefore, the
overall complexity per iteration for CGP is $O((m + 1 + n_r)K)$.
According to our computational experience, the value of $m$
usually lies between two and four. This shows that CGP
has linear complexity in $K$.

Also, as evidenced in the next section, the number of
iterations required for convergence in CGP is very insensitive
to the increase in the number of users. Moreover, CGP has
a modest memory requirement: it only requires the solution
information from the previous step, as opposed to the IWFs,
which require the previous $K - 1$ steps.

V. Numerical Results 

We first use an example of a MIMO-BC system consisting
of 10 users with $n_t = n_r = 4$ to show the convergence
behavior of our proposed algorithm. The weights of the 10
users are [1, 1.5, 0.8, 0.9, 1.4, 1.2, 0.7, 1.1, 1.03, 1.3], respectively.
The convergence process is plotted in Fig. 1. It can
be seen that CGP takes approximately 30 iterations to get
close to the optimal value.
Fig. 1. Convergence behavior of a 10-user MIMO-BC with $n_t = n_r = 4$.

To compare the efficiency of CGP with that of IWFs, we
give an example of an equal-weight large MIMO-BC system
consisting of 100 users with $n_t = n_r = 4$. The
convergence processes are plotted in Fig. 2. It is observed
from Fig. 2 that CGP takes only 29 iterations to converge
and it outperforms both IWFs. IWF1's convergence speed
drops significantly after the quick improvement in the early
stage. It is also seen in this example that IWF2's performance
is inferior to IWF1, and this observation is in accordance with
the results in [10]. Both IWF1 and IWF2 fail to converge
within 100 iterations. The scalability problem of both IWFs is
not surprising because in both IWFs, the most recently updated
covariance matrices only account for a fraction of $1/K$ in
the effective channels' computation, which means they do not
effectively make use of the most recent solution. In all of
our numerical examples with different numbers of users, CGP
always converges within 30 iterations.
VI. Conclusion

In this paper, we studied the maximum weighted sum rate 
(MWSR) problem of MIMO-BC. Specifically, we derived 
the MWSR problem of the dual MIMO-MAC with a sum 
power constraint and developed an efficient algorithm based 
on conjugate gradient projection (CGP) to solve the MWSR 
problem. Also, we theoretically and numerically analyzed its 
complexity and convergence behavior. Our contributions in 
this paper are three-fold: First, this paper is the first work 
that considers the MWSR problem of MIMO-BC; Second, 
we simplified the MWSR problem in the dual MIMO-MAC 
and showed that enumerating all different decoding orders 
is unnecessary; Third, we developed an efficient and well- 
scalable algorithm based on conjugate gradient projection 
(CGP). The attractive features of CGP and encouraging results 
in this paper showed that CGP is an excellent method for 
solving the MWSR problem of large MIMO-BC systems. 